coverage probability
Stability of Random Forests and Coverage of Random-Forest Prediction Intervals
We establish stability of random forests under the mild condition that the squared response ($Y^2$) does not have a heavy tail. In particular, our analysis holds for the practical version of random forests implemented in popular packages such as \texttt{randomForest} in \texttt{R}. Empirical results show that stability may persist even beyond our assumption and hold for heavy-tailed $Y^2$. Using the stability property, we prove a non-asymptotic lower bound for the coverage probability of prediction intervals constructed from the out-of-bag error of random forests. Under another mild condition that is typically satisfied when $Y$ is continuous, we also establish a complementary upper bound, which can be similarly established for the jackknife prediction interval constructed from an arbitrary stable algorithm. We also discuss the asymptotic coverage probability under assumptions weaker than those considered in the previous literature. Our work implies that random forests, with their stability property, are an effective machine learning method that can provide not only satisfactory point prediction but also justified interval prediction at almost no extra computational cost.
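As a concrete illustration of the interval construction described above, here is a minimal sketch (not the authors' implementation): fit a random forest, compute out-of-bag residuals, and add the empirical $(1-\alpha)$ quantile of their absolute values symmetrically around each point prediction. The sketch assumes scikit-learn's \texttt{RandomForestRegressor} as a stand-in for \texttt{randomForest} in \texttt{R}; the synthetic data and the level $\alpha = 0.1$ are arbitrary choices.

```python
# Minimal sketch: prediction intervals from out-of-bag (OOB) residuals of a random forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 5))
y = X[:, 0] ** 2 + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=500)

# oob_score=True makes scikit-learn keep an OOB prediction for every training point.
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X, y)

# OOB residuals approximate out-of-sample errors at essentially no extra cost.
oob_resid = y - rf.oob_prediction_

# Symmetric interval: point prediction +/- the (1 - alpha) quantile of |OOB residuals|.
alpha = 0.1
q = np.quantile(np.abs(oob_resid), 1 - alpha)

X_new = rng.uniform(-2, 2, size=(5, 5))
pred = rf.predict(X_new)
print(np.column_stack([pred - q, pred, pred + q]))  # lower bound, prediction, upper bound
```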
Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation
Gan, Feichen, Lu, Youcun, Zhang, Yingying, Liu, Yukun
Reliable uncertainty quantification is crucial for reinforcement learning (RL) in high-stakes settings. We propose a unified conformal prediction framework for infinite-horizon policy evaluation that constructs distribution-free prediction intervals for returns in both on-policy and off-policy settings. Our method integrates distributional RL with conformal calibration, addressing challenges such as unobserved returns, temporal dependencies, and distributional shifts. We propose a modular pseudo-return construction based on truncated rollouts and a time-aware calibration strategy using experience replay and weighted subsampling. These innovations mitigate model bias and restore approximate exchangeability, enabling uncertainty quantification even under policy shifts. Our theoretical analysis provides coverage guarantees that account for model misspecification and importance weight estimation. Empirical results, including experiments in synthetic and benchmark environments like Mountain Car, show that our method significantly improves coverage and reliability over standard distributional RL baselines.
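To fix ideas, the calibration step can be sketched as a weighted split-conformal quantile over nonconformity scores of pseudo-returns; this is a simplified sketch, not the authors' full time-aware procedure, and it omits the test-point weight correction. The pseudo-returns from truncated rollouts and the calibration weights (e.g., importance ratios under the target policy) are assumed given; all names below are illustrative.

```python
# Simplified sketch of weighted conformal calibration for returns.
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score s such that the normalized weight of {scores <= s} reaches `level`."""
    order = np.argsort(scores)
    scores, weights = scores[order], weights[order]
    cdf = np.cumsum(weights) / np.sum(weights)
    return scores[np.searchsorted(cdf, level)]

def conformal_return_interval(pred_return, cal_preds, cal_pseudo_returns, cal_weights, alpha=0.1):
    """Distribution-free interval for a new return, centered at the model's point prediction."""
    scores = np.abs(cal_pseudo_returns - cal_preds)         # nonconformity scores
    q = weighted_quantile(scores, cal_weights, 1 - alpha)   # weighted (1 - alpha) quantile
    return pred_return - q, pred_return + q

# Toy usage with synthetic calibration data.
rng = np.random.default_rng(1)
cal_preds = rng.normal(size=200)                             # model-predicted returns
cal_pseudo_returns = cal_preds + rng.normal(scale=0.5, size=200)
cal_weights = rng.uniform(0.5, 1.5, size=200)                # stand-in importance weights
print(conformal_return_interval(0.3, cal_preds, cal_pseudo_returns, cal_weights))
```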
Prediction-Augmented Trees for Reliable Statistical Inference
Kher, Vikram, Oikonomou, Argyris, Zampetakis, Manolis
The remarkable success of machine learning (ML) in predictive tasks has led scientists to incorporate ML predictions as a core component of the scientific discovery pipeline, as exemplified by the landmark achievement of AlphaFold (Jumper et al. (2021)). In this paper, we study how ML predictions can be safely used in the statistical analysis of data towards scientific discovery. In particular, we follow the framework introduced by Angelopoulos et al. (2023). In this framework, we assume access to a small set of $n$ gold-standard labeled samples, a much larger set of $N$ unlabeled samples, and an ML model that can be used to impute the labels of the unlabeled data points. We introduce two new learning-augmented estimators: (1) the Prediction-Augmented Residual Tree (PART), and (2) Prediction-Augmented Quadrature (PAQ). Both estimators have significant advantages over existing estimators such as PPI and PPI++, introduced by Angelopoulos et al. (2023) and Angelopoulos et al. (2024), respectively. PART is a decision-tree-based estimator built using a greedy criterion. We first characterize PART's asymptotic distribution and demonstrate how to construct valid confidence intervals. We then show that PART outperforms existing methods on real-world datasets from ecology, astronomy, and census reports, among other domains, yielding more precise estimates as a result of using both the gold-standard samples and the machine learning predictions. Finally, we provide a formal proof of the advantage of PART by studying PAQ, an estimator that arises as the limit of PART when the depth of its tree grows to infinity. Under appropriate assumptions on the input data, we show that the variance of PAQ shrinks at a rate of $O(N^{-1} + n^{-4})$, improving significantly on the $O(N^{-1}+n^{-1})$ rate of existing methods.
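For context, the framework the paper builds on can be illustrated with the basic prediction-powered mean estimator of Angelopoulos et al. (2023): average the model predictions on the $N$ unlabeled points and correct their bias with residuals from the $n$ labeled points. The sketch below shows that baseline, not PART or PAQ; the data-generating choices are arbitrary.

```python
# Minimal sketch of the PPI baseline for mean estimation (not PART/PAQ).
import numpy as np

def ppi_mean(y_labeled, yhat_labeled, yhat_unlabeled):
    """PPI estimate of E[Y]: mean prediction on unlabeled data plus a bias correction
    (the rectifier) estimated from the gold-standard labeled samples."""
    rectifier = np.mean(y_labeled - yhat_labeled)
    return np.mean(yhat_unlabeled) + rectifier

# Toy data: an ML model whose predictions are biased upward by 0.5.
rng = np.random.default_rng(2)
n, N = 100, 10_000
y_labeled = rng.normal(loc=1.0, size=n)
yhat_labeled = y_labeled + 0.5 + rng.normal(scale=0.2, size=n)
yhat_unlabeled = rng.normal(loc=1.0, size=N) + 0.5 + rng.normal(scale=0.2, size=N)

print(ppi_mean(y_labeled, yhat_labeled, yhat_unlabeled))  # close to the true mean 1.0
print(np.mean(yhat_unlabeled))                            # naive imputation, biased by ~0.5
```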